The Effect Of Smoothing In Language Models For Novelty Detection
Abstract
The novelty task consists of finding relevant and novel sentences in a ranked list of documents retrieved for a query. Different techniques have been applied to this problem in the literature. Nevertheless, little is known about Language Models for novelty detection and, in particular, about the effect of smoothing on the selection of novel sentences. Language Models can be used to study novelty and relevance in a principled way, and these statistical models have been shown to perform well empirically in many Information Retrieval tasks. In this work we formally study the effects of smoothing on novelty detection. To this end, we compare different techniques based on the Kullback-Leibler divergence and analyze the sensitivity of retrieval performance to the smoothing parameters. The ability of Language Modeling estimation methods to handle quantitatively the uncertainty associated with the use of natural language is a powerful tool that can drive the future development of novelty-based mechanisms.
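As an illustration of the kind of technique the abstract refers to, the following minimal Python sketch scores each sentence by its smallest Kullback-Leibler divergence from the sentences seen before it, using smoothed unigram models. The abstract does not say which smoothing methods or scoring formula the paper uses, so everything here is an assumption made for illustration: the choice of Jelinek-Mercer smoothing, the min-over-previous-sentences score, and all names (`jm_smoothed_model`, `novelty_scores`, the parameter `lam`).

```python
import math
from collections import Counter


def background_model(sentences, vocab):
    """Maximum-likelihood unigram model over all sentences (the 'collection')."""
    counts = Counter(tok for sent in sentences for tok in sent)
    total = sum(counts.values())
    return {w: counts[w] / total for w in vocab}


def jm_smoothed_model(tokens, background, vocab, lam):
    """Jelinek-Mercer smoothing (assumed here, not taken from the paper):
    p(w) = (1 - lam) * p_ml(w | sentence) + lam * p(w | background)."""
    counts = Counter(tokens)
    total = max(len(tokens), 1)  # guard against empty sentences
    return {w: (1 - lam) * counts[w] / total + lam * background[w] for w in vocab}


def kl_divergence(p, q):
    """KL(p || q) for two distributions over the same vocabulary."""
    return sum(pw * math.log(pw / q[w]) for w, pw in p.items() if pw > 0)


def novelty_scores(sentences, lam=0.5):
    """Score each tokenized sentence by its minimum KL divergence from any
    earlier sentence; a higher score means more novel. The first sentence
    has nothing to compare against and gets math.inf.
    lam must lie in (0, 1] so that every smoothed probability is positive."""
    vocab = {tok for sent in sentences for tok in sent}
    background = background_model(sentences, vocab)
    models = [jm_smoothed_model(s, background, vocab, lam) for s in sentences]
    scores = []
    for i, current in enumerate(models):
        if i == 0:
            scores.append(math.inf)
        else:
            scores.append(min(kl_divergence(current, prev) for prev in models[:i]))
    return scores


sentences = [["a", "b"], ["a", "b"], ["c", "d"]]
scores = novelty_scores(sentences, lam=0.5)
```

In this toy input the second sentence repeats the first and so receives a score of zero, while the third shares no terms with the earlier ones and receives a positive score. Varying `lam` changes how much probability mass the background model contributes, which is the kind of parameter sensitivity the abstract says the paper analyzes.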
Publication date: 2007